Abstract. Searching for optimal ways in a network is an important 
task in multiple application areas such as social networks, co-citation 
graphs or road networks. In the majority of applications, each edge in a 
network is associated with a certain cost and an optimal way minimizes 
the cost while fulfilling a certain property, e.g. connecting a start and a 
destination node. In this paper, we want to extend pure cost networks 
to so-called cost-gain networks. In this type of network, each edge is 
additionally associated with a certain gain. Thus, a way having a certain 
cost additionally provides a certain gain. In the following, we will discuss 
the problem of finding ways providing maximal gain while costing less 
than a certain budget. An application for this type of problem is the 
round trip problem of a traveler: Given a certain amount of time, which 
is the best round trip traversing the most scenic landscape or visiting the 
most important sights? In the following, we distinguish two cases of the 
problem. The first does not control any redundant edges and the second 
allows a more sophisticated handling of edges occurring more than once. 
To answer the maximum round trip queries on a given graph data set, we 
propose unidirectional and bidirectional search algorithms. Both types of 
algorithms are tested for the use case named above on real world spatial 
networks. 



1 Introduction 

Searching for optimal ways in networks is an important task in many application 
areas. In most cases, an optimal way is a way minimizing the cost while fulfilling 
a certain property. The cost is usually connected to traversing the edges being 
part of the way. For example, in road networks the cost of each edge might be 
considered as the time it takes to traverse the edge. Finding the fastest route 
between two nodes can now be defined as finding the route with the property to 
connect both points and additionally having a minimum cost. In social network 
analysis, the cost is often measured in hops and the shortest path between two 
members in the network is considered as a measure for the strength of their 
relationship. Other application areas for networks might be co-citation networks 
and protein interaction networks where shortest paths can be used to express 
similarity. In all of these domains it is expected that traversing a certain edge is 
connected to a certain cost. 



In this paper, we would like to extend this view by additionally connecting the 
traversal of an edge to a certain amount of gain. In our running example, we 
consider a traveler who wants to take a hike in a nature resort. Usually, the 
traveler wants to see as much of the landscape as possible. Thus, walking on a 
scenic trail with a good view and passing by important landmarks represents 
some gain. However, our traveler can usually only walk for a certain period of 
time until (s)he wants to be back to the starting point, e.g. the parking lot. Thus, 
the task is to maximize the gain of our traveler while not spending more than 
a certain amount of time. Another application for the proposed query type is a 
car announcing a blood donation event at the local hospital. The hospital has 
hired the car with the speaker for a limited period of time only and now wants 
to find a route reaching as many people willing to donate blood as possible. Let 
us note that the problem can be easily extended to searching a way ending at 
a different destination than the starting point. Though we will mainly focus on 
the case of round trips, the proposed query processing can be easily extended to 
the more general case. 

A round trip in our definition is a way and thus, it is allowed to pass by the 
nodes and edges more than once. Thus, it is different from finding Euler or 
Hamilton paths in a graph. An important issue about our problem definition is 
the possibility to traverse the same edge an arbitrary number of times. 
Depending on the given task, this might lead to round trips having a large 
redundancy in the number of visited edges. Therefore, we distinguish our 
problem further into two sub problems: The first allows edges to be visited 
several times and thus simply maximizes the gain while keeping the cost below 
the given threshold $\tau$. 
In the second setting, the definition of round trips is extended by allowing that 
each edge in the round trip is visited at most k times. Furthermore, each edge 
is considered only once when calculating the gain without consideration of any 
additional traversal. 

To solve both problems, we will introduce algorithms that determine a maximum 
gain round trip for all cost budgets being smaller than a certain threshold 
$\tau$. To tackle this computationally complex task for practically relevant 
cost thresholds, we will introduce pruning mechanisms and bidirectional search 
methods. 
The proposed algorithms are evaluated by searching round trips on real world 
map data, obtained from OpenStreetMap. 

The rest of the paper is organized as follows. Section 2 reviews related work. 
In Section 3 we define preliminaries and our new queries. Section 4 and 
Section 5 introduce pruning methods and search algorithms. Section 6 evaluates 
our algorithms on real world data w.r.t. runtime and various parameter 
settings. Finally, Section 7 summarizes the paper and outlines directions for 
future research. 

2 Related Work 

Common route search which starts from a single source node to at least one 
target node is also known as the single-source problem. This problem has been 
studied very extensively for a long time [1,2,3,4,5,6,7,9,11,15,17]. Also the 
task of finding not just the shortest or fastest route but the top k routes has 
attracted quite some interest and has been studied for several years, e.g. in 
[16]. 

The closest scenario to the cost-gain networks discussed in this paper are 
multi-attribute or multi-cost networks that associate multiple types of cost to 
traversing a single edge, e.g. length and average speed. In [12], the authors 
introduce preference queries in such multi-cost networks. In particular, the paper 
proposes ranking and skyline queries to compute and sort a result set of possible 
target destinations in a multi-cost transport network. This work is different from 
the work presented in this paper because the query result consists of possible 
destination locations. Another related work is presented in [13]. In this work, 
the authors also work on a multi-cost or multi-attribute network and compute 
a skyline operator on the paths leading from a given starting point to a des- 
tination. The query result consists of all paths from the starting node to the 
destination node having a pareto optimal cost vector. The important difference 
to the queries in this paper consists in the use of gain attributes. Using gain has 
a major impact on the characteristics of the solutions. In cost and multi-cost 
networks optimal solutions always consist of shortest paths w.r.t. some possibly 
combined cost function. In a cost-gain network, the impact of traversing an edge 
needs to be measured w.r.t. the provided gain as well. Thus, optimal solutions 
do not need to be paths but can be ways. For example, the skyline operator 
proposed in [13] cannot generate interesting round trips because leaving an edge 
cannot decrease any cost value compared to the trivial solution of simply staying 
at the starting node. Thus, the trivial solution would always dominate any other 
way leading away from the starting node. 

To the best of our knowledge, there exists no other work that formulates the 
search for optimal round trips while employing a cost-gain network. In theoretical 
computer science, there exist methods for finding all cycles present in a graph 
that will contain the most interesting round trips or parts of it. For example, 
[10] deals with the task of finding a complete cycle base in a graph. However, 
finding a complete cycle base is a different task and does not consider cost and 
gain values for the different edges. 

3 Cost-Gain Networks, Round Trips and Queries 

A network is represented by a graph where the edges have two attributes, cost 
and gain. Thus, we call this graph a cost-gain network: 

Definition 1 (Cost-Gain Network (CGN)). 

A cost-gain network is a graph $G(V, E, cost, gain)$ where $V$ denotes the set of 
vertices and $E \subseteq V \times V$ denotes the set of edges. 

$cost : E \rightarrow \mathbb{R}_0^+$ is called cost function where $cost(e)$ 
denotes the non-negative cost for an edge $e \in E$. 

$gain : E \rightarrow \mathbb{R}_0^+$ is called gain function where $gain(e)$ 
denotes the non-negative gain for an edge $e \in E$. 

In our running example, a CGN represents a network of roads, streets, paths, 
sidewalks and trails. The nodes correspond to crossings, the edges correspond to 



path segments in the graph. The cost of a segment represents a certain 
combination of the characteristics of this segment, like the length, the maximum 
speed, the time needed to pass the segment, the inclination etc. The gain of a segment 
can be defined correspondingly by combining all characteristics that the user 
considers as beneficial, e.g. a trail that is not used by cars might be considered 
as more attractive for a hiker than a highway segment etc. 

Let us also note that the attributes of an edge $e = (n_s, n_d)$ need not be the 
same as those of the edge $e' = (n_d, n_s)$ even though both edges describe exactly the same 
path just in different directions as the user might for example avoid declines of 
a certain degree whilst accepting inclines of some other degree. Also the impact 
on travel speed for this edge might be different. 
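
As a minimal sketch of how a CGN with direction-dependent attributes could be held in memory, the following Python fragment stores one directed edge object per direction. The class names `Edge` and `CostGainNetwork`, the node labels and the numeric values are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """Directed edge carrying its own non-negative cost and gain."""
    src: str
    dst: str
    cost: float
    gain: float


@dataclass
class CostGainNetwork:
    """Adjacency lists per node: outlinks leave a node, inlinks enter it."""
    outlinks: dict = field(default_factory=dict)
    inlinks: dict = field(default_factory=dict)

    def add_edge(self, src, dst, cost, gain):
        if cost < 0 or gain < 0:
            raise ValueError("cost and gain must be non-negative")
        edge = Edge(src, dst, cost, gain)
        self.outlinks.setdefault(src, []).append(edge)
        self.inlinks.setdefault(dst, []).append(edge)
        return edge


# Hypothetical trail segment: walking uphill takes longer than walking downhill.
cgn = CostGainNetwork()
uphill = cgn.add_edge("parking", "summit", cost=40.0, gain=5.0)
downhill = cgn.add_edge("summit", "parking", cost=25.0, gain=5.0)
```

Keeping both outlinks and inlinks is a design choice that pays off later, because a backward search needs to follow edges against their direction.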

Definition 2 (way). A way $w$ is a sequence of edges 
$((v_1,v_2), (v_2,v_3), \ldots, (v_{k-1},v_k))$ where the following condition holds: 

$$\forall\, 1 \le i < k : \exists\, e \in E : e = (v_i, v_{i+1}) \qquad (1)$$

The cost of a way $w$ is defined as follows: 

$$cost(w) = \sum_{i=1}^{k-1} cost((v_i, v_{i+1})) \qquad (2)$$

The gain of a way $w$ is defined as follows: 

$$gain(w) = \sum_{i=1}^{k-1} gain((v_i, v_{i+1})) \qquad (3)$$

In other words, a way is a sequence of connected edges. A round trip is a 
way starting and ending with the same node s. 

Definition 3 (round trip). A way 
$w = ((v_1,v_2), (v_2,v_3), \ldots, (v_{k-1},v_k))$ is called round trip if 
$v_1 = v_k$. 
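
Definitions 2 and 3 translate almost literally into code. The sketch below assumes that a way is a list of `(source, target)` tuples and that cost and gain are given as dictionaries keyed by edge; these representations and all values are illustrative assumptions.

```python
def way_cost(way, cost):
    """Equation (2): sum of the edge costs along the way."""
    return sum(cost[e] for e in way)


def way_gain(way, gain):
    """Equation (3): sum of the edge gains, counting repeated edges each time."""
    return sum(gain[e] for e in way)


def is_way(way, edges):
    """Every edge must exist in E and consecutive edges must be connected."""
    return (all(e in edges for e in way)
            and all(a[1] == b[0] for a, b in zip(way, way[1:])))


def is_round_trip(way):
    """Definition 3: the first node equals the last node."""
    return bool(way) and way[0][0] == way[-1][1]


# Hypothetical three-edge network.
cost = {("s", "a"): 2.0, ("a", "b"): 1.0, ("b", "s"): 3.0}
gain = {("s", "a"): 1.0, ("a", "b"): 4.0, ("b", "s"): 0.5}
trip = [("s", "a"), ("a", "b"), ("b", "s")]
```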

In the above definition of a round trip, there is no consideration of whether 
or not a round trip contains one and the same edge more than once. However, 
passing the same edge very often does not yield an additional benefit in most 
applications. For example, for a hiker looking for the most scenic walk, it might 
be necessary to pass the same edge twice, if some edges having a large gain 
are placed in a dead end. However, after visiting a spot and returning on the 
same path, it would not make any sense to go there again. Thus, a possibility to 
control the redundancy is to limit the number of times an edge can be contained 
in the round trip. Furthermore, traversing the same edge more than once should 
not contribute to the gain of the round trip. Thus, we can extend our definition 
of round trips to round trips with redundancy control. Formally, this can be 
formulated as follows: 



Definition 4 (Redundancy Control). Given the 
round trip $r$ in the CGN $G(V, E, cost, gain)$, $r$ is called a round trip under 
redundancy control with level $k \in \mathbb{N}$, if the following condition holds: 

$$\forall (v_a, v_b) \in r : \left|\{(v_i, v_{i+1}) \in r \mid (v_i = v_a \wedge v_{i+1} = v_b) \vee (v_i = v_b \wedge v_{i+1} = v_a)\}\right| \le k$$

For a round trip $r$ under redundancy control, the gain is calculated as follows: 

$$gain(r) = \sum_{(v_i, v_{i+1}) \in ES(r)} gain((v_i, v_{i+1})) \quad \text{with } ES(r) = \{e \mid e \in r\}$$
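
The two parts of Definition 4 can be sketched as follows. The code assumes, in line with the condition above, that both directions of a segment count towards the same limit k. It additionally assumes that a segment earns gain only on its first traversal, whichever direction that is; the definition of ES(r) leaves this detail open, so this is one possible reading rather than the authors' implementation.

```python
from collections import Counter


def undirected(edge):
    """(u, v) and (v, u) map to the same key."""
    return frozenset(edge)


def respects_redundancy(trip, k):
    """True if no segment is traversed more than k times in either direction."""
    counts = Counter(undirected(e) for e in trip)
    return all(c <= k for c in counts.values())


def gain_under_redundancy(trip, gain):
    """Each segment contributes gain on its first traversal only."""
    seen = set()
    total = 0.0
    for e in trip:
        key = undirected(e)
        if key not in seen:  # later traversals add cost but no gain
            seen.add(key)
            total += gain[e]
    return total


# Hypothetical dead end: the hiker walks s -> a -> b and returns the same way.
gain = {("s", "a"): 1.0, ("a", "b"): 3.0, ("b", "a"): 3.0, ("a", "s"): 1.0}
trip = [("s", "a"), ("a", "b"), ("b", "a"), ("a", "s")]
```

With k = 2 this out-and-back trip is valid, while k = 1 would forbid returning on the same path.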

After defining both types of round trips, we can now define maximum gain 
round trip queries: 

Definition 5 (Maximum Gain Round Trip Query). 

Given the CGN $G(V, E, cost, gain)$, a starting node $s \in V$ and cost threshold 
$\tau \in \mathbb{R}^+$, the result of a maximum gain round trip query (MGRQ) is the 
set $R$ of round trips, such that for each element $r \in R$ the following 
constraints hold: 

(a) $cost(r) \le \tau$ 

(b) $\forall r, \hat{r} \in R : gain(r) < gain(\hat{r}) \Leftrightarrow cost(r) < cost(\hat{r})$ 

In other words, the result of an MGRQ contains a round trip providing maximum 
gain for each cost level not exceeding $\tau$. Thus, the result set contains 
all pareto optimal ways starting and ending at s. 
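
To illustrate the shape of an MGRQ result, the following sketch filters a list of already known round trips, given only as `(cost, gain)` pairs, down to the pareto optimal ones within the budget. The function names and the candidate values are assumptions made for this example; the search algorithms of Section 4 never enumerate all round trips in this way.

```python
def pareto_front(trips):
    """Keep the (cost, gain) pairs that no other pair dominates."""
    front = []
    best_gain = float("-inf")
    # Ascending cost; among equal costs the highest gain comes first.
    for cost, gain in sorted(set(trips), key=lambda t: (t[0], -t[1])):
        if gain > best_gain:  # more expensive trips must offer more gain
            front.append((cost, gain))
            best_gain = gain
    return front


def mgrq_result(trips, tau):
    """Constraint (a) as a filter, constraint (b) through the pareto front."""
    return pareto_front([t for t in trips if t[0] <= tau])


# Hypothetical (cost, gain) pairs of round trips starting at the same node.
candidates = [(4.0, 3.0), (6.0, 5.5), (6.0, 5.0),
              (7.0, 5.5), (9.0, 8.0), (12.0, 9.0)]
```

For a budget of 10, the trips (4.0, 3.0), (6.0, 5.5) and (9.0, 8.0) remain: within this set, a higher cost always comes with a higher gain, as required by constraint (b).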

4 Algorithms for MGRQs 

In this section, we will examine the problem of searching maximum gain round 
trips with cost constraints without controlling the redundancy. 

4.1 Pruning Round Trips 

The simplest pruning condition for a way $w = ((v_1,v_2), \ldots, (v_{k-1},v_k))$ 
during our search is that the way cannot be extended into a round trip 
$r = ((v_1,v_2), \ldots, (v_{k-1},v_k), \ldots, (v_l,v_1))$ with $cost(r) \le \tau$. 
Thus, if $cost(w) > \tau$, $w$ can be pruned because edge costs are non-negative 
and the cost of any extension of $w$ is therefore at least $cost(w)$. 

Let us note that pruning w.r.t. gain is not that simple. Extending a way 
usually increases the gain and thus, there is no upper limit for the gain of a 
way. However, since we are only looking for round trips having an optimal cost- 
gain ratio, we can use the following observation for defining a further pruning 
criterion: For each $r \in R$, we can guarantee that there is no other round trip 
generating more gain and having at most the same cost. Therefore, each partition 
$w$ of the result round trip $r$ has a pareto optimal cost gain ratio w.r.t. all other 
ways starting and ending at the same position as w. The intuition behind this 
conclusion is that each part of the round trip serves three purposes: it moves 
the traveler to the end of the way, it generates a certain amount of cost and it 
provides a certain amount of gain. Apart from this, the properties of the way do 
not influence the properties of the complete round trip. Formally, we can define 
the following pruning rule based on the domination relationship: 



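[Figure 1: two panels, (a) and (b); the recoverable axis label is cost.] 

Fig. 1. Local pruning of pareto optimal paths: if $w$ is dominated by $\hat{w}$, 
any further extension $\Delta$ of $w$ and $\hat{w}$ does not change the pruning 
relation so that $w + \Delta$ is still dominated by $\hat{w} + \Delta$. 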

Lemma 1 (local pruning by domination). Let $w = ((v_1,v_2), \ldots, (v_{k-1},v_k))$ 
be a way in $G(V, E, cost, gain)$, then $w$ is dominated by another way 
$\hat{w} = ((v_1,\hat{v}_2), \ldots, (\hat{v}_{l-1},v_k))$ if either 
$cost(w) \ge cost(\hat{w}) \wedge gain(w) < gain(\hat{w})$ or 
$cost(w) > cost(\hat{w}) \wedge gain(w) \le gain(\hat{w})$. If $w$ is dominated by 
$\hat{w}$, then $w$ cannot be extended into an element of the result set of an 
MGRQ for $v_1$. 

Proof. Consider the result round trip $r$ with $gain(r)$ and $cost(r)$. Due to the 
definition of the result set, we can rule out that there is another round trip 
$\hat{r}$ having at most the same cost and more gain or at least the same gain and 
less cost. Now, consider a way $w = (v_1, v_2, \ldots, v_i)$ being part of $r$. If 
there exists another way $\hat{w} = (v_1, \hat{v}_2, \ldots, v_i)$ leading from 
$v_1$ to $v_i$ with $cost(\hat{w}) \le cost(w) \wedge gain(\hat{w}) > gain(w)$ or 
$cost(\hat{w}) < cost(w) \wedge gain(\hat{w}) \ge gain(w)$, then it is possible to 
construct a round trip $r_{new}$ by replacing $w$ by $\hat{w}$ in $r$. However, 
in this case $r_{new}$ would dominate $r$ which contradicts the condition that $r$ 
is part of the result set. 

An illustration of this pruning mechanism can be found in Figure 1. 

4.2 A Basic MGRQ Algorithm 

In our descriptions, we will denote the edges starting at node v as outlinks of v 
while the edges ending at v are called inlinks. Our algorithm employs two data 
structures. The first is a hash table called node tab containing an entry for each 
visited node $v_i$. Furthermore, the node tab stores all undominated ways starting 
at s and ending at $v_i$. For each of these ways, we store a flag indicating whether 
we already processed the way in a previous step or not. In the following, we will 
use the expression "update the node tab with way w" for the following steps 
being used in our algorithm: 



1. Check whether w is dominated by any entry of the node tab. 

2. If w is not dominated, insert w into the node tab entry. 

3. Remove all entries from the node tab which are dominated by w. 
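
The three update steps can be sketched for a single node tab entry as follows. Ways are reduced to `(cost, gain)` pairs here, and a way with exactly the same pair as a stored one is skipped; the latter is an assumption added for this sketch to avoid storing duplicates and is not part of the steps listed above.

```python
def dominates(a, b):
    """True if the (cost, gain) pair a dominates b in the sense of Lemma 1."""
    return (a[0] <= b[0] and a[1] > b[1]) or (a[0] < b[0] and a[1] >= b[1])


def update_entry(entry, way):
    """Update one node tab entry (a list of pairs); return True on insertion."""
    # Step 1: reject the way if a stored way dominates (or equals) it.
    if any(dominates(old, way) or old == way for old in entry):
        return False
    # Step 3: drop stored ways that the new way dominates.
    entry[:] = [old for old in entry if not dominates(way, old)]
    # Step 2: insert the undominated way.
    entry.append(way)
    return True


entry = []
inserted = [update_entry(entry, w)
            for w in [(5.0, 2.0), (6.0, 1.0), (7.0, 4.0), (4.0, 2.0)]]
```

In this run, (6.0, 1.0) is rejected because (5.0, 2.0) is cheaper and offers more gain, and (5.0, 2.0) is later removed by the cheaper (4.0, 2.0).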

Our second data structure is a priority queue containing all nodes. Each node $v_i$ 
is prioritized by the maximum gain among all ways ending at $v_i$ and the queue 
is organized in descending order. 



SimpleMGRQ(Node s, Float tau) 
    Nodetab tab = InitNodetab() 
    PriorityQueue queue = InitQueue() 
    FOR EACH Link l IN s.outlinks() DO 
        Way w = new Way(s, l) 
        tab.update(w) 
        queue.update(w.last, w.gain) 
    END FOR 
    WHILE NOT queue.isEmpty() DO 
        Entry entry = queue.pop() 
        List<Way> aktList = entry.getUndominated() 
        aktList = removeProcessedWays(aktList) 
        setProcessed(aktList) 
        List<Way> candidates = extendWays(aktList) 
        FOR EACH w IN candidates DO 
            IF w.cost <= tau DO 
                tab.update(w) 
                queue.update(w.last, w.gain) 
            END IF 
        END FOR 
    END WHILE 
    RETURN tab.getEntry(s) 



Fig. 2. Pseudocode of the simple MGRQ Algorithm 



The algorithm starts by generating the ways resulting from following all 
outlinks of the starting node s. Afterwards the ways are used to update the node tab 
as well as the queue. Now the algorithm enters the main loop which is repeated 
until the priority queue is empty. In each iteration, the algorithm pops the top 
node from the queue and retrieves all unprocessed ways from the node tab. Let us 
note that we have to keep already processed ways in the node tab for determining 
locally dominated ways. However, it is not required to process each way more 
than once. Now, each unprocessed way is marked as processed. Afterwards we 
extend each way by all of its outlinks, generating a set of candidate ways. The 
candidate ways are checked whether their cost exceeds the limit $\tau$. Afterwards 
each candidate $c = ((s,v_1), \ldots, (v_k,v_{new}))$ is checked against the ways 
being stored in the node tab entry of its end node $v_{new}$. If $c$ is not 
dominated by any other way, it is inserted into the node tab entry. Furthermore, 
if $c$ dominates former members of the node tab entry, these members can be pruned 
due to their suboptimal cost gain ratio. If the maximum gain of any node tab entry 
being modified is increased, the entry has to be updated in the queue. After the 
queue is empty, the result of our query can be found in the node tab entry of 
the starting node s. Figure 2 displays the algorithm in pseudo code. 



Let us note that the above algorithm is capable of finding arbitrary pareto 
optimal ways having a cost of at most $\tau$ and ending at any visited node. Thus, it 
is not restricted to the search of round trips. 
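
The following Python sketch follows the structure of the unidirectional algorithm under a few simplifying assumptions that are not taken from the paper: the graph is a dictionary mapping each node to `(target, cost, gain)` triples, all edge costs are strictly positive so that the number of ways within the budget is finite, the queue is a binary heap keyed by the gain of the inserted way instead of the maximum gain per node, and ways with an identical cost-gain pair are stored only once.

```python
import heapq


def dominates(a, b):
    """True if the (cost, gain) pair a dominates b (Lemma 1)."""
    return (a[0] <= b[0] and a[1] > b[1]) or (a[0] < b[0] and a[1] >= b[1])


def simple_mgrq(outlinks, s, tau):
    """Return the undominated round trips from s as (cost, gain, nodes)."""
    tab = {}   # node -> list of [cost, gain, nodes, processed]
    heap = []  # max-heap on gain via negated keys; stale entries are harmless

    def update(cost, gain, nodes):
        entry = tab.setdefault(nodes[-1], [])
        if any(dominates((c, g), (cost, gain)) or (c, g) == (cost, gain)
               for c, g, _, _ in entry):
            return
        entry[:] = [w for w in entry
                    if not dominates((cost, gain), (w[0], w[1]))]
        entry.append([cost, gain, nodes, False])
        heapq.heappush(heap, (-gain, nodes[-1]))

    for target, c, g in outlinks.get(s, []):
        if c <= tau:
            update(c, g, (s, target))
    while heap:
        _, node = heapq.heappop(heap)
        for way in [w for w in tab[node] if not w[3]]:
            way[3] = True  # each way is extended at most once
            for target, c, g in outlinks.get(node, []):
                if way[0] + c <= tau:  # cost-based pruning
                    update(way[0] + c, way[1] + g, way[2] + (target,))
    return sorted((c, g, n) for c, g, n, _ in tab.get(s, []))


# Hypothetical network with three nodes.
outlinks = {
    "s": [("a", 1.0, 1.0)],
    "a": [("s", 1.0, 0.0), ("b", 1.0, 3.0)],
    "b": [("a", 1.0, 3.0), ("s", 2.0, 1.0)],
}
result = simple_mgrq(outlinks, "s", 4.0)
```

For the budget 4.0, the sketch reports the round trips s-a-s with cost 2.0 and gain 1.0 and s-a-b-a-s with cost 4.0 and gain 7.0. The trip s-a-b-s with the same cost but gain 5.0 is dominated and removed.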



BidirectionalMGRQ(Node s, Float tau) 
    Nodetab tab = InitNodetab() 
    PriorityQueue queue = InitQueue() 
    FOR EACH Link l IN s.outlinks() DO 
        Way w = new Way(s, l) 
        tab.updateStart(w) 
        queue.update(w.last, w.gain) 
    END FOR 
    FOR EACH Link l IN s.inlinks() DO 
        Way w = new Way(s, l) 
        w = w.reverse() 
        tab.updateReturn(w) 
        queue.update(w.first, w.gain) 
    END FOR 
    WHILE NOT queue.isEmpty() DO 
        Entry entry = queue.pop() 
        List<Way> fwdList = entry.getUndominatedStart() 
        fwdList = removeProcessed(fwdList) 
        setProcessed(fwdList) 
        List<Way> bwdList = entry.getUndominatedReturn() 
        bwdList = removeProcessed(bwdList) 
        setProcessed(bwdList) 
        FOR EACH w IN fwdList DO 
            IF w.cost > tau/2 DO 
                fwdList.delete(w) 
            END IF 
        END FOR 
        List<Way> fwdCandidates = extendFwdWays(fwdList) 
        FOR EACH w IN fwdCandidates DO 
            tab.updateStart(w) 
            queue.update(w.last, w.gain) 
        END FOR 
        List<Way> bwdCandidates = extendBwdWays(bwdList) 
        FOR EACH w IN bwdCandidates DO 
            IF w.cost <= tau/2 DO 
                tab.updateReturn(w) 
                queue.update(w.first, w.gain) 
            END IF 
        END FOR 
    END WHILE 
    List<Way> result = new List<Way>() 
    FOR EACH entry IN tab DO 
        FOR EACH startWay IN entry.getUndominatedStart() DO 
            FOR EACH retWay IN entry.getUndominatedReturn() DO 
                Way roundtrip = startWay.extend(retWay) 
                result.update(roundtrip) 
            END FOR 
        END FOR 
    END FOR 
    RETURN result.entries 



Fig. 3. Pseudocode of the bidirectional MGRQ Algorithm 



4.3 Bidirectional Round Trip Search 

A further method to improve the runtime is bidirectional search as employed in 
the well-known bidirectional Dijkstra search for shortest paths [14]. For searching 
maximum gain round trips, bidirectional search yields an even stronger 
advantage. The algorithm described above generates ways having a cost of at most 
$\tau$. Thus, it has to visit any node that is reachable by spending the cost limit 
$\tau$. However, a round trip has to end at its starting node. Obviously, a way 
$w_1$ that already consumes the complete budget $\tau$ cannot have a return path 
$w_2$ to the starting node such that the round trip does not exceed $\tau$. Thus, it 
is only necessary to explore ways $w_1$ for which there exists a return path $w_2$ 
with $cost(w_2) \le \tau - cost(w_1)$. 

To conclude, it is only necessary to extend each way $w$ until $cost(w)$ exceeds 
$\tau/2$. In particular, we can distinguish two cases when trying to split a round 
trip $w$ into two partitions of equal cost. In the first case, there is a node after 
the distance of exactly $\tau/2$. In the second case, the split point having exactly 
the cost of $\tau/2$ is located on an edge $e = (v_i, v_{i+1})$. Then, there is a 
unique partitioning of $w$ into three parts $w_1$, $e$ and $w_2$, and by extending 
$w_1$ by $e$, we will get a unique partitioning of $w$ into starting way $w_1$ and 
return way $w_2$. Based on this observation, we can stop extending a way as soon as 
it exceeds the cost limit of $\tau/2$, i.e. after at most one additional hop, 
and thus, approximately work with only half of the search radius. 
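
The partitioning argument can be checked on a small example. The sketch below assumes that a round trip is given only by the list of its edge costs and splits it at the first edge where the accumulated cost reaches half of the budget; the function name and the numbers are illustrative.

```python
def split_round_trip(edge_costs, tau):
    """Split a round trip with total cost <= tau into start and return part.

    The start part ends with the first edge at which the accumulated cost
    reaches tau / 2, so it exceeds tau / 2 by at most one hop.
    """
    total = 0.0
    for i, c in enumerate(edge_costs):
        total += c
        if total >= tau / 2:
            return edge_costs[:i + 1], edge_costs[i + 1:]
    return edge_costs, []  # the whole trip costs less than tau / 2


# Second case: the split point at cost 5 lies inside the third edge.
start, ret = split_round_trip([2.0, 2.0, 3.0, 1.0, 2.0], tau=10.0)
# First case: a node is reached after exactly tau / 2.
start_exact, ret_exact = split_round_trip([2.0, 3.0, 5.0], tau=10.0)
```

In both cases the start part without its last edge costs less than τ/2 and the return part costs at most τ/2, which is why both search directions can stop at roughly half of the budget.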

Though bidirectional search is applicable for searching any maximum gain 
way $w$ having $cost(w) \le \tau$, it is especially well suited for searching round trips. 
Since the area of the graph that has to be explored for finding the starting ways 
is the same as the area being explored for finding the return ways, it is possible 
to simultaneously search for both parts of a round trip. Thus, the part of the 
graph being accessed during query processing is significantly decreased. 

Our bidirectional search algorithm employs a node tab which manages two lists of 
undominated ways for each visited node $v$. The first contains all undominated 
ways leading from the starting node s to $v$ and the second manages all 
undominated ways starting at $v$ and leading to s. The node tab is updated in an 
almost identical way as described above. The only difference is that we have to 
distinguish whether $w$ is a starting way or a return way and update the 
corresponding list. The processing 
order is again managed by a priority queue that is ordered by the maximum gain 
being observed in either part of the node tab. 

Fig. 4. The problem of local pruning if the edge cardinality is constrained. 
$\Delta$ indicates the cost/gain of $p_c$ which is added to the paths $p_a$, $p_b$. 
The cost contribution to $p_a$ and $p_b$ is the same, but the gain differs because 
$p_c$ and $p_a$ share an edge. Thus, the pruning power is lowered as well. 

The algorithm proceeds as follows: At initialization, the algorithm considers 
all outlinks of the starting node s and generates a first set of starting ways. After 
updating the node tab with these ways and inserting the corresponding nodes 
into the priority queue, we use all inlinks to s to generate a first set of return 
ways and again update the node tab and the priority queue. Now the algorithm 
enters the main loop which is iterated until the priority queue is empty. In each 
iteration $i$, the algorithm pops the top node $v_i$ from the queue. Afterwards, it 
retrieves all unprocessed ways from the list of starting ways leading to $v_i$ in the 
node tab and checks if these have a cost smaller than $\tau/2$. If the cost of a way 
$w$ is still smaller than $\tau/2$, the algorithm extends $w$ by all outlinks of 
$v_i$ and generates a candidate set $C_w$. Each element $c \in C_w$ is now used to 
update the node tab and the priority queue in case the maximum gain of the node 
tab entry of the last node of $c$ is increased. Afterwards the algorithm retrieves 
all unprocessed return ways. Each of these ways $w_{ret}$, which start at $v_i$ and 
end at the origin s, is extended by all inlinks of $v_i$ into a set of candidate 
ways $C_{w_{ret}}$. Then each candidate $c \in C_{w_{ret}}$ is checked whether its 
cost is still less than or equal to $\tau/2$. If $c$ passes this test, $c$ is used 
to update the node tab and the queue, if the maximum gain of the entry of its 
first node is increased. Let us note that it is important to check starting ways 
and return ways at different stages of the processing to obtain all ways satisfying 
the partitioning described above. After the priority queue is empty, i.e. there is 
no unprocessed way left that can be extended any further, we need to join the 
pareto optimal starting ways and return ways. The result set is organized in a 
list of undominated round trips which is updated by visiting all entries of the 
node tab. For each entry representing a node $v$, we examine all combinations of 
undominated starting ways $w_{start}$ and return ways $w_{ret}$. For each pair of 
ways $w_{start}$ and $w_{ret}$, we first of all determine the cost of the 
corresponding round trip by adding $cost(w_{start}) + cost(w_{ret})$ and the 
corresponding gain by adding $gain(w_{start}) + gain(w_{ret})$. Based on this 
cost-gain vector, we can now update the result list of undominated round trips. If 
the new round trip is undominated, we join both ways and add the result to the 
result list. If the new round trip even dominates formerly pareto optimal round 
trips, the now dominated round trips are deleted. After each entry of the node tab 
is processed, the algorithm terminates and the result consists of all pareto 
optimal round trips with a cost of up to $\tau$. Figure 3 describes the 
bidirectional search in pseudo code. 
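
The final join of starting ways and return ways can be sketched as follows. The node tab is assumed to be a dictionary mapping each node to a pair of lists holding `(cost, gain, nodes)` tuples, and the example entries are hypothetical. The explicit test against the budget and the skipping of round trips with an identical cost-gain pair are precautions added for this sketch.

```python
def dominates(a, b):
    """True if a dominates b, comparing only cost (index 0) and gain (index 1)."""
    return (a[0] <= b[0] and a[1] > b[1]) or (a[0] < b[0] and a[1] >= b[1])


def join_round_trips(tab, tau):
    """Combine starting and return ways meeting at the same node."""
    result = []
    for start_ways, return_ways in tab.values():
        for sc, sg, sn in start_ways:
            for rc, rg, rn in return_ways:
                # The meeting node ends sn and starts rn, so keep it once.
                trip = (sc + rc, sg + rg, sn + rn[1:])
                if trip[0] > tau:
                    continue
                if any(dominates(old, trip) or old[:2] == trip[:2]
                       for old in result):
                    continue
                result = [old for old in result if not dominates(trip, old)]
                result.append(trip)
    return sorted(result)


# Hypothetical node tab after the search with tau = 6.
tab = {
    "a": ([(1.0, 1.0, ("s", "a")), (3.0, 7.0, ("s", "a", "b", "a"))],
          [(1.0, 0.0, ("a", "s")), (3.0, 4.0, ("a", "b", "s"))]),
    "b": ([(2.0, 4.0, ("s", "a", "b"))],
          [(2.0, 3.0, ("b", "a", "s"))]),
}
trips = join_round_trips(tab, 6.0)
```

Note that the same round trip can be assembled at several meeting nodes; here the trip with cost 4.0 and gain 7.0 is found at both a and b and kept only once.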



5 MGRQs with Redundancy Control 

In this section, we will discuss maximum gain round trip queries under the 
constraints limiting the amount and the impact of edges which occur more than 
once. In particular, we will not allow that a round trip contains the same edge 
more than $k$ times. Furthermore, as the cost will sum up over duplicate edges 
as well, the gain of ways is calculated over the set of all edges being contained 
in the round trip. Since there are no duplicates in a set, each edge can add its 
gain only once to the round trip. Please recall that we will consider edge 
$(v_i, v_j)$ as equivalent to $(v_j, v_i)$ w.r.t. redundancy control. 

5.1 Pruning and Redundancy Control 

Since computing the cost is the same in both types of round trips, pruning ways 
w.r.t. their cost is applicable in the same way as described in Section 4. However, 
when trying to exploit the cost-gain ratio to prune ways, both redundancy control 
mechanisms have a major impact. When using redundancy control, we cannot 
guarantee that all partitions of a pareto optimal round trip $r$ are pareto optimal 
in the same way as described above. Figure 4 illustrates this effect in a simple 
example. Even though the way $\hat{w}$ dominates the way $w$, $\hat{w}$ is not part 
of the pareto optimal round trip $r$. However, the dominated way $w$ is part of $r$ 
and replacing $w$ by $\hat{w}$ would lead to a round trip $\hat{r}$ being dominated 
by $r$. In this example we can observe that the influence of $w$ on the gain of the 
complete round trip $r$ is not limited to $cost(w)$, $gain(w)$ and its end node $v$. 
If $w$ is extended into a round trip, all edges being visited on $w$ will not add 
any gain when they are traversed again and thus, these edges influence the cost 
gain ratio of the complete round trip. A similar observation holds for limiting the 
cardinality of each edge in a round trip to $k$. In this case, a pareto optimal 
round trip $r$ might contain the dominated way $w$ because the way $\hat{w}$ 
dominating $w$ contains edges that would occur more than $k$ times when replacing 
$w$ by $\hat{w}$ in $r$. In other words, the corresponding round trip $\hat{r}$ 
containing $\hat{w}$ would violate the redundancy control. 

In order to retain the possibility of local pruning, the domination relation 
defined above must be extended by the set of visited edges. Thus, to extend local 
pruning to redundancy controlled round trips, we can formulate the following 
lemma: 

Lemma 2 (domination under redundancy control). 

Let $w = ((v_1,v_2), \ldots, (v_{k-1},v_k))$ be a way in $G(V, E, cost, gain)$, then 
$w$ is called dominated under redundancy control by another way 
$\hat{w} = ((v_1,\hat{v}_2), \ldots, (\hat{v}_{l-1},v_k))$ if the following 
conditions hold: 

(a) $(cost(w) \ge cost(\hat{w}) \wedge gain(w) < gain(\hat{w})) \vee (cost(w) > cost(\hat{w}) \wedge gain(w) \le gain(\hat{w}))$ 

(b) For $ES(w) = \{e \in E \mid e \in w\}$, $ES(\hat{w}) = \{e \in E \mid e \in \hat{w}\}$: $ES(\hat{w}) \subseteq ES(w)$ 

If $w$ is dominated under redundancy control by $\hat{w}$, then $w$ cannot be extended 
into an element of the result set of an MGRQ for $v_1$. 
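
A possible check for Lemma 2 is sketched below. Ways are assumed to be `(cost, gain, nodes)` tuples, and the edge sets are built from undirected segments because both directions are treated as equivalent under redundancy control. Using undirected segments in the subset test is an assumption of this sketch; the lemma itself states the condition on the edge sets ES only.

```python
def edge_set(nodes):
    """ES(w) as a set of undirected segments of a way given by its nodes."""
    return {frozenset(e) for e in zip(nodes, nodes[1:])}


def dominated_under_redundancy(w, w_hat):
    """True if w_hat dominates w under redundancy control (Lemma 2)."""
    cond_a = ((w[0] >= w_hat[0] and w[1] < w_hat[1])
              or (w[0] > w_hat[0] and w[1] <= w_hat[1]))
    cond_b = edge_set(w_hat[2]) <= edge_set(w[2])  # ES(w_hat) subset of ES(w)
    return cond_a and cond_b


# Hypothetical ways from n1 to n2; the detour over n3 offers no gain.
w_hat = (1.0, 2.0, ("n1", "n2"))
w = (3.0, 2.0, ("n1", "n2", "n3", "n2"))
other = (1.0, 2.0, ("n1", "n4", "n2"))
```

The way `w` is pruned by `w_hat` since it only adds a gainless cycle, whereas `other` cannot prune `w` although it is cheaper: it uses edges that `w` does not contain, so condition (b) fails.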

Proof. The proof for condition (a) is identical to the proof in Section 4.1. It remains 
to show that replacing $w$ in a round trip $\hat{r}$ by $\hat{w}$ cannot lead to a 
reduced gain due to duplicate edges or to an invalid round trip. Consider the way 
$w_{ret}$ extending $w$ to the round trip $\hat{r}$, and let $r$ denote the round 
trip obtained by extending $\hat{w}$ with $w_{ret}$. The additional gain being 
earned by traversing $w_{ret}$ in $\hat{r}$ can be described as 

$$\sum_{e \in ES(w_{ret})} gain(e) - \sum_{e \in ES(w) \cap ES(w_{ret})} gain(e)$$

Since $gain(e) \ge 0$ and $ES(w) \cap ES(w_{ret}) \supseteq ES(\hat{w}) \cap ES(w_{ret})$, 
$gain(r) \ge gain(\hat{r})$. Furthermore, if $\hat{r}$ does not violate the 
redundancy parameter $k$, then $r$ cannot violate it either because it follows from 
$ES(\hat{w}) \subseteq ES(w)$ that $\hat{w}$ does not contain any edge $e$ with 
$e \notin w$. Thus, any violation in $r$ would be encountered in $\hat{r}$ as well. 

Fig. 5. This figure depicts a way $(n_1, n_2, n_3, n_2)$ that can be pruned if 
$gain((n_2, n_3, n_2)) = 0$. In any case, additional traversals of the cycle 
$(n_2, n_3, n_2)$ do not yield any gain. Thus, the way 
$(n_1, n_2, n_3, n_2, n_3, n_2)$ would be pruned in any case. 

A major issue for the usefulness of this lemma is whether there are enough 
ways that can be pruned to justify the additional effort for comparing the edge 
sets. In the following, we start off by discussing the worst case scenario and after- 
wards point out the cases in which our pruning rule still justifies the overhead. 

For values of $k \le 2$ and $gain(e) > 0$ for each $e \in E$, it is impossible that 
any way can be pruned. Since the dominated way $w$ must contain at least the same 
edges as $\hat{w}$, $gain(w) \ge gain(\hat{w})$. Due to $\hat{w}$ dominating $w$, we 
know $gain(w) = gain(\hat{w})$ and $cost(\hat{w}) < cost(w)$. Based on this 
observation, it follows that $w$ and $\hat{w}$ must have the same set of edges, i.e. 
$ES(w) = ES(\hat{w})$ because $\forall e \in E : gain(e) > 0$ and 
$gain(w) = gain(\hat{w})$. Thus, $w$ cannot contain an additional edge compared to 
$\hat{w}$. Thus, the only allowed difference between $w$ and $\hat{w}$ is that $w$ 
visits some edges $e \in ES(w)$ more than once. For $k = 2$ the only possibility 
that $w$ and $\hat{w}$ end with the same node is $w = \hat{w}$. 

Correspondingly, for $k > 2$ and $\forall e \in E : gain(e) > 0$, there might be 
ways $w$ being dominated by $\hat{w}$. In Figure 5 we illustrate the types of ways 
being pruned. On the left side $w$ extends $\hat{w}$ by a cycle consisting of new 
edges offering no gain. The example on the right hand side excludes the way $w$ 
because it visits a cycle in $\hat{w}$ more than once. To sum up, domination under 
redundancy control prunes all ways containing cycles which do not provide gain. 

A final pruning mechanism we have to consider is the redundancy parameter 
$k$. Since we can prune all ways violating the cardinality threshold $k$, the number 
of ways we have to consider for further extension is usually much smaller than 
in the general case without redundancy control. The smaller the value of $k$ is 
chosen, the stronger is its pruning power. 

To conclude, searching for maximum gain round trips under redundancy 
control can employ cost-based pruning as in Section 4. Additionally, choosing 
a small parameter value $k$ also has the power to prune invalid ways during the 
traversal. However, when increasing the value of $k$, the number of pruned paths is 
strongly decreasing. For such parameter settings, domination-based pruning justifies 
the overhead because it prevents the algorithms from exploring ways which are 
revisiting identical parts of the way multiple times. 



5.2 Algorithm for MGRQs with Redundancy Control

Both algorithms for answering MGRQs described in Section 4 are easily
adaptable for the case of redundancy controlled round trips. To modify the
algorithms, we first of all have to integrate the redundancy control. Therefore,
the gain calculation has to be changed in order to prevent duplicate edges from
adding any further gain. Furthermore, each time the algorithm tries to extend a
way w by an edge (v_i, v_j), we have to check whether (v_i, v_j) or its
complement (v_j, v_i) is already contained in w k times, so that the extension
would exceed the threshold. This way, all invalid candidates are already
pruned before they are constructed in the first place.
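
The two modifications named above, the changed gain calculation and the multiplicity check, might look as follows. This is a minimal sketch and not the implementation evaluated in this paper: the names Way, try_extend, edge_cost and edge_gain are hypothetical, edges are stored as undirected keys so that (v_i, v_j) and (v_j, v_i) share one counter, and the check against the cost limit is left out.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional

Edge = FrozenSet[str]  # undirected key shared by (v_i, v_j) and (v_j, v_i)


@dataclass
class Way:
    nodes: List[str]
    cost: float = 0.0
    gain: float = 0.0
    uses: Counter = field(default_factory=Counter)  # traversals per edge


def try_extend(w: Way, v_next: str,
               edge_cost: Dict[Edge, float],
               edge_gain: Dict[Edge, float],
               k: int) -> Optional[Way]:
    """Extend w by the edge (last node, v_next), or return None if the
    edge has already been used k times (redundancy control)."""
    e = frozenset((w.nodes[-1], v_next))
    if w.uses[e] >= k:
        return None  # candidate is pruned before it is constructed
    new_uses = w.uses.copy()
    new_uses[e] += 1
    # A duplicate edge adds cost but no further gain.
    extra_gain = edge_gain[e] if w.uses[e] == 0 else 0.0
    return Way(w.nodes + [v_next], w.cost + edge_cost[e],
               w.gain + extra_gain, new_uses)


# Tiny example graph with a single edge s-a.
sa = frozenset(("s", "a"))
edge_cost = {sa: 2.0}
edge_gain = {sa: 1.0}

w0 = Way(["s"])
w1 = try_extend(w0, "a", edge_cost, edge_gain, k=2)   # first traversal
w2 = try_extend(w1, "s", edge_cost, edge_gain, k=2)   # second traversal
w3 = try_extend(w2, "a", edge_cost, edge_gain, k=2)   # third one is rejected
```

Copying the counter on every extension keeps the sketch simple; a real implementation would probably share the prefix of a way instead.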

In our case, we want to use domination under redundancy control as described 
above. The method for checking domination must be extended by additionally 
checking for the subset condition. However, if we employ small values of k and
the majority of edges have a non-zero gain, it often makes sense to abandon
pruning based on domination to avoid the computational overhead. In this case, 
the list of ways in the node tab entry for node v contains all valid ways found 
so far. 

A final difference for the first algorithm described in Section 4 is that the
result set has to be stored in a dedicated list of Pareto optimal ways. In the
previous setting, the node tab entry of the starting node s already contains
a Pareto optimal list of round trips. However, since the pruning rules under
redundancy control are less restrictive, the remaining dominated round trips
must be removed before returning the result set. Let us note that this difference
has no impact on the bidirectional algorithm because the last step joining starting
ways and return ways has to construct a new Pareto optimal result set anyway.
After implementing the named modifications, we can employ both algorithms to 
compute MGRQs under redundancy control. 
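
The final clean-up step for the unidirectional algorithm can be illustrated with a short Python sketch. It assumes that, for the result set, a round trip is dominated as soon as another round trip offers at least the same gain at no higher cost; the function name pareto_filter and the tuple layout (cost, gain, label) are hypothetical.

```python
from typing import List, Tuple

Trip = Tuple[float, float, str]  # (cost, gain, label of the round trip)


def pareto_filter(trips: List[Trip]) -> List[Trip]:
    """Keep only round trips that are Pareto optimal w.r.t. cost and gain."""
    result: List[Trip] = []
    best_gain = float("-inf")
    # Cheapest trips first; among equal costs, the highest gain first.
    for trip in sorted(trips, key=lambda t: (t[0], -t[1])):
        if trip[1] > best_gain:  # strictly more gain than any cheaper trip
            result.append(trip)
            best_gain = trip[1]
    return result


candidates = [
    (4.0, 1.0, "A"),
    (6.0, 1.0, "B"),  # same gain as A at higher cost
    (6.0, 3.0, "C"),
    (9.0, 2.0, "D"),  # less gain than C at higher cost
    (3.0, 0.0, "E"),
]
result = pareto_filter(candidates)
```

After sorting, a single pass suffices because a trip can only be dominated by one that appears earlier in the order. Of several trips with identical cost and gain, the sketch keeps one.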



6 Experimental Evaluation 

In our evaluation, we use data obtained from OpenStreetMap¹. We preprocessed
the data using the converter provided in [8] to remove some of the nodes having
a degree of 2. In our tests, we examined three different areas which are popular
for hiking: Kirchsee (GER), Jasper (AB, CA) and Grand Canyon Village (AZ, US).
For each area, we selected a central starting point².

¹ Data and map data © OpenStreetMap (and) contributors, CC-BY-SA
http://www.OpenStreetMap.org

² Kirchsee: http://www.openstreetmap.org/?node=312519650
Jasper: http://www.openstreetmap.org/?node=915165849
Grand Canyon: http://www.openstreetmap.org/?node=174618876



For all our tests, we chose the Euclidean distance between two nodes as a cost
criterion. To represent the gain, we considered the road type. Thus, we assigned
a gain of 1 to each edge allowing a maximum speed of less than 30 km/h and a
gain of 0 to all other edges.
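
As an illustration, the cost and gain assignment described above could be written as follows. The sketch assumes that node coordinates are already given as planar (x, y) pairs in metres and that the speed limit of each edge is known in km/h; the function names edge_cost and edge_gain are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # assumed planar coordinates in metres


def edge_cost(p: Point, q: Point) -> float:
    """Cost of an edge: Euclidean distance between its two end nodes."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def edge_gain(max_speed_kmh: float) -> int:
    """Gain of an edge: 1 for slow roads (below 30 km/h), otherwise 0."""
    return 1 if max_speed_kmh < 30 else 0


# A 500 m edge on a slow road and the same edge on a faster road.
cost = edge_cost((0.0, 0.0), (300.0, 400.0))
slow_gain = edge_gain(20)
fast_gain = edge_gain(50)
```

With this binary gain, the gain of a round trip simply counts the edges on slow roads, each at most once under redundancy control.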

Processing time. In our first set of experiments, we compare the runtime of the
round trip search with redundancy control (rtSearchwRC) to the search allowing
redundancy (rtSearch). Furthermore, we compare the bidirectional search
(bidirectionalRTS) and the bidirectional search with redundancy control
(bidirectionalRTSwRC). For rtSearchwRC and bidirectionalRTSwRC the redundancy
parameter k was set to 1.

Figure 6 displays the runtime of all four algorithms with increasing cost
values for all three maps. A first observation is that all algorithms display a
certain cost budget for which the search starts to show a superlinear increase in
search time. However, we can observe that when employing bidirectional search the
threshold for which query processing takes less than a minute can be extended 
to a reasonably large distance for round trips, large enough to be interesting 
(up to 17 km). Furthermore, on all graphs the bidirectional search could extend 
the cost limit being processable in less than one minute in comparison to the 
unidirectional algorithm for the same query type. A final conclusion that can be 
drawn from the results is that queries without redundancy control can be pro- 
cessed much faster than those employing redundancy control. This observation 
can be explained by the fact that dominance on the cost-gain graph is a much 
stronger pruning mechanism than dominance under redundancy control. Later 
on, we will present an experiment showing that dominance under redundancy 
control is very important for larger values of k. 

Search space. Another important factor when searching in graph data is the
portion of the graph which has to be available in main memory. Thus, we examined
the increase of the nodes visited during the search with bidirectionalRTS
and bidirectionalRTSwRC w.r.t. the cost threshold τ. The results for all three
maps and both bidirectional search algorithms are displayed in Figure 7. It can
be seen that the portion of the graph visited increases approximately linearly
with the threshold parameter τ. Furthermore, we can observe that bidirectionalRTS
and bidirectionalRTSwRC visit comparable portions of the graph. Let us
note that Figure 7(b) is scaled differently. Since the maximum cost threshold
processable for bidirectionalRTSwRC is smaller than for bidirectionalRTS, we could
not generate values for all distance thresholds we displayed in Figure 7(a). To
conclude, MGRQs visit rather small portions of the graph. The high complexity
of MGRQs is caused by the large number of possible round trips rather than
by the size of the graph.

Impact of the Redundancy Level k. Another impact factor on the search speed
when using redundancy control is k, which limits the cardinality of an edge.
Figure 8 illustrates the impact of k on the retrieval times of bidirectionalRTSwRC
on the Jasper map with k varying from 1 to 6. First, it can be observed that
setting k = 1 yields better runtimes: the largest distance threshold which could
still be computed was 10 km. For values of k ≥ 2, the maximum threshold
which could be reached was around 8 km. An interesting result is that for all
k ≥ 2 the runtime was approximately the same. Thus, with the exception of the
special case k = 1, the value of k does not have a strong influence on the runtime.
The reason for this effect can be found in the pruning rule employing dominance
under redundancy control. We will demonstrate this effect more clearly in the
next experiment.

Fig. 6. Figures 6(a), 6(b) and 6(c) show the runtime of the proposed algorithms.
For rtSearchwRC and bidirectionalRTSwRC, k was set to 1.

Fig. 8. Runtime of bidirectionalRTSwRC for different values of k

Domination under Redundancy Control. In Section 5, we proposed domination
under redundancy control to implement an additional pruning rule. We already
explained why the impact of this rule strongly depends on the value of the
redundancy parameter k. In the following, we will examine the runtime behaviour of
our search algorithms with dominance pruning (rtSearchwRC and
bidirectionalRTSwRC) and the search algorithms without the dominance pruning rule
(simpleRTSwRC and simpleBidirectionalRTSwRC) for increasing values of k. The
result is displayed in Figure 9. The experiment was executed on the Jasper map
with τ = 1700 m and k varying from 1 to 6. Let us note that the relatively small
value for τ was chosen to still be able to compute results with simpleRTSwRC
and simpleBidirectionalRTSwRC for larger values of k.

In case of k = 1, our pruning rule cannot have any impact. Thus, it can
be seen that the algorithms without the additional pruning rule perform better
for k = 1. For increasing values of k, we can observe an exponential
increase of the runtime for the basic algorithms simpleRTSwRC and
simpleBidirectionalRTSwRC. In contrast, rtSearchwRC and bidirectionalRTSwRC show an
almost constant runtime behaviour for increasing values of k. Thus, employing
dominance pruning under redundancy control is a significant improvement for
larger values of k.



7 Conclusions 

In this paper, we examined maximum gain round trip queries (MGRQs) with
cost constraints in cost-gain networks. A cost-gain network is a graph where each
edge is associated with a certain amount of cost and additionally provides a
certain amount of gain. A round trip is a way starting and ending at the same node
and a maximum gain round trip is a round trip providing the maximum gain for a
certain cost. The result set of an MGRQ is a set of round trips containing a
maximum gain round trip for every cost level less than or equal to a cost limit
τ. We propose solutions for two subproblems. The first deals with round trips
where edges might occur multiple times. The second subproblem restricts the
number of times an edge can occur in the solution to a maximum value of k. Our
algorithms are tested on real-world map data taken from OpenStreetMap. For
future work, we plan to examine further pruning mechanisms which are based on
optimistic forward approximations similar to A*-search. Furthermore, we plan
to examine parallel algorithms to extend the cost limit that is still computable.
Finally, we will develop approximate algorithms for the case that an exact
search requires too many resources.